314 ◾ Bioinformatics
-1 fastq_pure/ERR1823587_pure_R1-50.fastq.gz \
-2 fastq_pure/ERR1823587_pure_R2-50.fastq.gz \
--report-file centrifuge_out/ERR1823587-report.txt \
-S centrifuge_out/ERR1823587-results.txt
centrifuge -x p+h+v \
-1 fastq_pure/ERR1823601_pure_R1-50.fastq.gz \
-2 fastq_pure/ERR1823601_pure_R2-50.fastq.gz \
--report-file centrifuge_out/ERR1823601-report.txt \
-S centrifuge_out/ERR1823601-results.txt
centrifuge -x p+h+v \
-1 fastq_pure/ERR1823608_pure_R1-50.fastq.gz \
-2 fastq_pure/ERR1823608_pure_R2-50.fastq.gz \
--report-file centrifuge_out/ERR1823608-report.txt \
-S centrifuge_out/ERR1823608-results.txt
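Since the three commands differ only in the accession ID, they can also be written as a loop. The following is a minimal sketch rather than part of the original workflow. It assumes that the first sample uses the same p+h+v index as the other two and that the centrifuge_out directory already exists. The function name run_centrifuge_all and the DRY_RUN switch are hypothetical and introduced here only for illustration.

```shell
# Sketch: run Centrifuge on each of the three samples in turn.
# Accession IDs, index name, and paths are copied from the commands above.
# DRY_RUN is a hypothetical switch: when set to 1, commands are printed, not run.
run_centrifuge_all() {
    for sample in ERR1823587 ERR1823601 ERR1823608; do
        # Build the command as positional parameters so that quoting is preserved.
        set -- centrifuge -x p+h+v \
            -1 "fastq_pure/${sample}_pure_R1-50.fastq.gz" \
            -2 "fastq_pure/${sample}_pure_R2-50.fastq.gz" \
            --report-file "centrifuge_out/${sample}-report.txt" \
            -S "centrifuge_out/${sample}-results.txt"
        if [ "${DRY_RUN:-0}" = 1 ]; then
            echo "$*"
        else
            "$@"
        fi
    done
}
```

Calling DRY_RUN=1 run_centrifuge_all prints the three commands for inspection, and calling run_centrifuge_all without the switch executes them.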
The results are saved in the “*-results.txt” files. Each read classified by Centrifuge produces
a single line of output consisting of eight tab-delimited fields: (1) the read ID (from the
FASTQ file); (2) the sequence ID (from the database sequence); (3) the taxonomic ID of
the database sequence; (4) the classification score (a weighted sum of hits); (5) the score of
the next-best classification; (6) the approximate number of base pairs of the read that match
the database sequence; (7) the length of the read or the combined length of the mate pairs;
and (8) the number of classifications for this read.
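Because the results files are plain tab-delimited text, they can be summarized with standard command-line tools. The sketch below counts the output lines assigned to each taxonomic ID (the third field). It assumes that the first line of the file is a column header. The function name count_by_taxid is hypothetical.

```shell
# Sketch: count classification lines per taxonomic ID in a results file.
# Assumes tab-delimited fields in the order described above and one header line.
count_by_taxid() {
    results_file=$1
    awk -F '\t' '
        NR == 1 { next }          # skip the assumed header line
        { count[$3]++ }           # field 3 holds the taxonomic ID
        END { for (id in count) printf "%s\t%d\n", id, count[id] }
    ' "$results_file" | sort -t "$(printf '\t')" -k2,2nr -k1,1
}
```

For example, count_by_taxid centrifuge_out/ERR1823587-results.txt would list the taxonomic IDs found in the first sample, most frequent first. The counts are numbers of output lines, not abundance estimates, which are given in the report files.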
The “*-report.txt” files contain summaries of the identified taxa and their abundances.
Each line in the file consists of seven tab-delimited fields: the name of a genome,
FIGURE 8.2 Partial Centrifuge report for the healthy sample.